End-to-End Generation of Written-style Transcript of Speech from Parliamentary Meetings

Abstract

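Conventional speech recognition systems are designed to faithfully reproduce every word in the input speech, so even when recognition accuracy is high they do not necessarily output text that is easy for humans to read. In contrast, this study describes a new approach to speech recognition that outputs highly readable written-style text directly from speech while making the necessary edits, such as deleting fillers and misstatements, inserting punctuation and omitted particles, and correcting colloquial expressions. We formulate this approach as an end-to-end conversion from speech to written-style text using a single neural network. We also propose a method that pseudo-restores a transcript faithful to the speech in order to assist training of the end-to-end model, and a new speech segmentation method that uses punctuation positions as cues. Evaluation experiments on 700 hours of speech from deliberations of the House of Representatives show that the proposed method generates written-style text more accurately and faster than a cascade approach that combines speech recognition with text-based spoken-style conversion. Furthermore, we classify and organize the corrections that editors make when producing the official records of Diet meetings, and analyze the proposed system's level of achievement and error tendencies for each of them.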

Similar resources

The Aesthetic Dimension of Howard Barker's Art: A Frankfurtian Approach to Scenes from an Execution and No End of Blame

The relationship between art and the social conditions in which it is born has, throughout history, been a fundamental concern and preoccupation of critics as well as artists. Because art is confined in the iron cage of social life, its growing dependence on the surrounding social institutions and principles seems inevitable, regardless of whether those institutions are aligned with it or not. Nevertheless, the emergence of such important debates among critics, with the rise of the school of Ma...

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

In this work, we propose “global style tokens” (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system. The embeddings are trained with no explicit labels, yet learn to model a large range of acoustic expressiveness. GSTs lead to a rich set of significant results. The soft interpretable “labels” they generate can be used to con...

Comparison of nerve repair with end to end, end to side with window and end to side without window methods in lower extremity of rat

Abstract Background: Although there have been various studies on end-to-side nerve repair, the results are controversial. This method is important in cases where the proximal nerve is unavailable. In this method, donor nerves also remain intact and uninjured. Compared with other classic procedures, end-to-side repair is less time-consuming and needs less dissection. Overall, the previous studies i...

End-to-end esophagojejunostomy versus standard end-to-side esophagojejunostomy: which one is preferable?

Abstract Background: End-to-side esophagojejunostomy has almost always been associated with some degree of dysphagia. To overcome this complication we decided to perform an end-to-end anastomosis and compare it with end-to-side Roux-en-Y esophagojejunostomy. Methods: In this prospective study, between 1998 and 2005, 71 patients with a diagnosis of gastric adenocarcinoma underwent total gastrec...

End-to-End Neural Speech Synthesis

In recent years, end-to-end neural networks have become the state of the art for speech recognition tasks and they are now widely deployed in industry (Amodei et al., 2016). Naturally, this has led to the creation of systems to do the opposite – end-to-end speech synthesis from raw text. Very recently, neural TTS systems have become highly competitive with their conventional counterparts, showi...

Journal

Journal title: Shizen gengo shori

Year: 2023

ISSN: 1340-7619, 2185-8314

DOI: https://doi.org/10.5715/jnlp.30.88